Our project, "A Lot on Our Tectonic Plates," examines seismic activity and its implications. Using data, technology, and scientific expertise, we aim to understand the causes, effects, and patterns of earthquakes and their impact on various industry sectors, and ultimately to contribute to a safer, more resilient world in the face of this powerful natural phenomenon.
Earthquakes, unpredictable and destructive, shape landscapes and impact communities. The project explores earthquake behavior, emphasizing contextual factors like tectonic plate movements and fault line activity. Our motivation stems from earthquakes being among the deadliest disasters, causing extensive structural and economic damage. For instance, the earthquake in Turkey and Syria in February 2023 resulted in 56,259 deaths and economic losses ranging from US$10 billion to US$100 billion. Recovering from such a massive loss will take these countries years. Earthquakes not only trigger secondary events like tsunamis and landslides but also leave lasting economic scars on affected regions. By studying earthquake patterns and analyzing their economic impacts, we seek to help reduce future industrial losses. This information is critical for advocating more effective precautionary measures, enhancing disaster preparedness, and contributing to the development of more resilient societies in earthquake-prone regions.
Our selection of questions of interest is grounded in the overarching goal of our project. Each question was chosen based on its significance in addressing different facets of earthquake occurrence, from natural causes to human-induced activities.
Trend in Earthquake Activity Over Time :
Significance: Recognizing patterns in earthquake occurrence helps in long-term preparedness and resource planning.
Impact: Identifying trends over time informs whether seismic activity is increasing, aiding in disaster mitigation strategies.
Expectation: Anticipating an increasing trend in earthquake frequency over time, influenced by technological advancements and potential climate change impacts. Note: Dataset limitations may skew findings towards higher magnitude earthquakes.
Mining and Nuclear Explosions Impact on Earthquakes :
Significance: Addressing human-induced seismicity is essential for responsible resource extraction.
Impact: Knowing the correlation between mining/blasting and seismic activity contributes to sustainable practices and risk reduction.
Expectation: Overall, a correlation between mining/blasting and increased seismic activity is expected, although limitations in our dataset focusing on earthquakes above magnitude 4 may not fully capture mining-induced seismic events.
Nuclear Power Plants and Earthquake Impact :
Significance: Understanding the potential connection between nuclear activities and seismic events is critical for safety.
Impact: Findings guide safety measures around nuclear plants, ensuring minimal risk of earthquakes triggered by such activities.
Expectation: Foreseeing a link between nuclear explosions and seismic activity, but the relationship with earthquake magnitude remains unclear.
Regions Affected by Severe Earthquakes :
Significance: Prioritizing earthquake-prone regions is crucial for effective disaster preparedness.
Impact: Understanding the most affected areas informs resource allocation and drills, aiding vulnerable communities.
Expectation: The Ring of Fire is anticipated to experience the most severe earthquakes due to its tectonic activity, guiding prioritization of drills and resource allocation.
In summary, these questions were chosen to address the diverse aspects of earthquake occurrence, combining natural and human-induced factors. The anticipated findings aim to provide actionable insights for disaster preparedness, risk reduction, and the development of resilient communities in earthquake-prone regions.
The dataset is obtained from the USGS Earthquake website (USGS) and covers global earthquake data from 2000 to 2023, with 321,789 observations and 22 variables, including time, location, depth, magnitude, and event type.
The variables in the dataset and their descriptions are as follows:
| Column Name | Description |
|---|---|
| time | The time at which the earthquake occurred |
| latitude | The latitude of the earthquake |
| longitude | The longitude of the earthquake |
| depth | Depth of the event in kilometers |
| mag | Magnitude of the event (the scale used is given by magType) |
| magType | The method or algorithm used to calculate the preferred magnitude for the event |
| nst | The total number of seismic stations used to determine earthquake location |
| gap | The largest azimuthal gap between azimuthally adjacent seismic stations recording the earthquake (in degrees) |
| dmin | Horizontal distance from the epicenter to the nearest station (in degrees) |
| rms | Root-mean-square travel-time residual (in seconds); a measure of the fit of the observed arrival times to the predicted arrival times for this location |
| id | A unique identifier for the event |
| updated | Time when the event was most recently updated |
| net | The ID of a data contributor |
| place | Geographic region near the event |
| type | Type of seismic event (e.g., blasting, earthquake, explosion, etc.) |
| horizontalError | Uncertainty of reported location of the event in kilometers |
| depthError | Uncertainty of reported depth of the event in kilometers |
| magError | Uncertainty of reported magnitude of the event. The estimated standard error of the magnitude |
| magNst | The total number of seismic stations used to calculate the magnitude for this earthquake |
| status | Status is either automatic or reviewed. Automatic events are directly posted by automatic processing systems and have not been verified or altered by a human |
| locationSource | The network that originally authored the reported location of this event |
| magSource | Network that originally authored the reported magnitude for this event |
We choose Data Analysis as the primary focus for grading due to the intricate nature of our earthquake study. While proficient data processing is essential, the complexity of seismic data necessitates advanced analytical techniques to uncover meaningful patterns, correlations, and insights. Our project goes beyond routine data processing by employing sophisticated methods to decipher nuanced trends, contributing to a deeper understanding of earthquake occurrences and their implications for disaster preparedness and mitigation.
Furthermore, our emphasis on data analysis is justified by the multifaceted questions of interest, which require nuanced interpretations and correlations between seismic events and various influencing factors. The intricate exploration of trends over time, human activities, and geographical patterns demands a sophisticated analytical approach, showcasing the project's commitment to deriving comprehensive and valuable insights from the dataset.
Because the dataset is large and contains various anomalies, we first need to decide how to clean it. To do that, let us look at what the dataset contains. We import the Python libraries below and use them throughout the project for various tasks.
# The numpy and pandas libraries are used for data processing and cleaning
# Matplotlib and Seaborn are used to create visualizations
# the re (regex) is used for text processing
# datetime and dateutil.parser converts string to datetime objects
# basemap and folium are used to create map plots
import pandas as pd
import numpy as np
from numpy import nan as NA
import datetime as dt
from dateutil.parser import parse
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
import folium
from geopy.distance import geodesic
from folium.plugins import MarkerCluster
from folium import IFrame
from mpl_toolkits.basemap import Basemap
import re
The first step is to load the data from the file into a data frame and have a good look at it.
# read_csv creates a DataFrame from the CSV file so that we can analyze it
earthquake = pd.read_csv('Earthquake_2000_to_2023 Original.csv')
# head() method will display first five rows of the dataset
earthquake.head()
| | time | latitude | longitude | depth | mag | magType | nst | gap | dmin | rms | ... | updated | place | type | horizontalError | depthError | magError | magNst | status | locationSource | magSource |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000-12-31T23:50:35.500Z | 52.317 | 160.552 | 33.0 | 4.3 | mb | 14.0 | NaN | NaN | 1.16 | ... | 2014-11-07T01:11:48.035Z | 151 km ESE of Petropavlovsk-Kamchatsky, Russia | earthquake | NaN | NaN | NaN | 3.0 | reviewed | us | us |
| 1 | 2000-12-31T23:42:59.690Z | -26.547 | -107.261 | 10.0 | 5.4 | mwc | 80.0 | NaN | NaN | 0.84 | ... | 2016-11-10T00:20:01.175Z | 225 km ENE of Hanga Roa, Chile | earthquake | NaN | NaN | NaN | NaN | reviewed | us | hrv |
| 2 | 2000-12-31T22:07:00.300Z | -38.530 | 178.930 | 91.0 | 4.0 | ml | 11.0 | NaN | NaN | NaN | ... | 2014-11-07T01:11:48.020Z | 81 km E of Gisborne, New Zealand | earthquake | NaN | NaN | NaN | NaN | reviewed | wel | wel |
| 3 | 2000-12-31T21:56:50.900Z | -38.040 | 178.800 | 33.0 | 5.3 | mwc | 58.0 | NaN | NaN | NaN | ... | 2022-04-29T19:03:06.871Z | 97 km NE of Gisborne, New Zealand | earthquake | NaN | NaN | NaN | NaN | reviewed | wel | hrv |
| 4 | 2000-12-31T16:12:34.400Z | -56.179 | -27.217 | 100.0 | 4.5 | mb | 14.0 | NaN | NaN | 0.54 | ... | 2014-11-07T01:11:48.004Z | South Sandwich Islands region | earthquake | NaN | NaN | NaN | 2.0 | reviewed | us | us |
5 rows × 22 columns
# Visualizing the last five rows of data
earthquake.tail()
| | time | latitude | longitude | depth | mag | magType | nst | gap | dmin | rms | ... | updated | place | type | horizontalError | depthError | magError | magNst | status | locationSource | magSource |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 321784 | 2022-10-10T00:54:02.984Z | 52.1613 | -170.4070 | 45.180 | 4.8 | mwr | 123.0 | 123.0 | 1.815 | 0.78 | ... | 2022-12-17T22:53:48.040Z | 135 km SW of Nikolski, Alaska | earthquake | 7.25 | 5.613 | 0.075 | 17.0 | reviewed | us | us |
| 321785 | 2022-10-10T00:46:57.010Z | -58.7992 | -24.1658 | 10.000 | 4.3 | mb | 12.0 | 179.0 | 8.171 | 0.74 | ... | 2022-12-17T22:54:09.040Z | South Sandwich Islands region | earthquake | 12.14 | 1.924 | 0.168 | 10.0 | reviewed | us | us |
| 321786 | 2022-10-10T00:44:47.962Z | 54.0830 | -35.0767 | 10.000 | 4.6 | mb | 78.0 | 57.0 | 8.996 | 0.43 | ... | 2022-12-17T22:54:08.040Z | Reykjanes Ridge | earthquake | 9.80 | 1.871 | 0.071 | 59.0 | reviewed | us | us |
| 321787 | 2022-10-10T00:26:17.442Z | 38.8416 | 142.1723 | 45.627 | 4.6 | mwr | 118.0 | 122.0 | 2.124 | 0.55 | ... | 2022-12-17T22:53:48.040Z | 47 km ESE of Ōfunato, Japan | earthquake | 7.96 | 5.771 | 0.063 | 24.0 | reviewed | us | us |
| 321788 | 2022-10-10T00:02:59.585Z | 42.0966 | 144.1664 | 35.000 | 4.9 | mb | 143.0 | 74.0 | 0.756 | 0.58 | ... | 2022-12-17T22:53:48.040Z | Hokkaido, Japan region | earthquake | 3.78 | 1.824 | 0.044 | 160.0 | reviewed | us | us |
5 rows × 22 columns
Now, for us to understand what data should be cleaned, we need to perform exploratory data analysis.
Here, we will look at the dataset's dimensions, column types, summary statistics, and missing values.
# Determines the number of rows and columns present in the dataset
earthquake.shape
(321789, 22)
# Gives an overview of the type of data
earthquake.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 321789 entries, 0 to 321788
Data columns (total 22 columns):
 #   Column           Non-Null Count   Dtype
---  ------           --------------   -----
 0   time             321789 non-null  object
 1   latitude         321789 non-null  float64
 2   longitude        321789 non-null  float64
 3   depth            321789 non-null  float64
 4   mag              321789 non-null  float64
 5   magType          321789 non-null  object
 6   nst              185681 non-null  float64
 7   gap              286434 non-null  float64
 8   dmin             150241 non-null  float64
 9   rms              301990 non-null  float64
 10  net              321789 non-null  object
 11  id               321789 non-null  object
 12  updated          321789 non-null  object
 13  place            319870 non-null  object
 14  type             321789 non-null  object
 15  horizontalError  136798 non-null  float64
 16  depthError       215716 non-null  float64
 17  magError         149854 non-null  float64
 18  magNst           275295 non-null  float64
 19  status           321789 non-null  object
 20  locationSource   321789 non-null  object
 21  magSource        321789 non-null  object
dtypes: float64(12), object(10)
memory usage: 54.0+ MB
# describe method will provide descriptive statistical measures
earthquake.describe()
| | latitude | longitude | depth | mag | nst | gap | dmin | rms | horizontalError | depthError | magError | magNst |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 321789.000000 | 321789.000000 | 321789.000000 | 321789.000000 | 185681.000000 | 286434.000000 | 150241.000000 | 301990.000000 | 136798.000000 | 215716.000000 | 149854.000000 | 275295.000000 |
| mean | 3.790758 | 38.417872 | 79.594908 | 4.536560 | 55.856313 | 112.002455 | 3.898871 | 0.873939 | 8.596409 | 9.630159 | 0.128768 | 28.063746 |
| std | 29.099115 | 120.665175 | 129.250402 | 0.420547 | 76.136256 | 52.375836 | 5.040551 | 0.302056 | 3.554687 | 795.508198 | 0.071220 | 48.633972 |
| min | -84.422000 | -179.999700 | -3.290000 | 3.380000 | 0.000000 | 6.500000 | 0.000000 | -1.000000 | 0.000000 | -1.000000 | 0.000000 | 0.000000 |
| 25% | -18.020600 | -72.032000 | 10.000000 | 4.200000 | 16.000000 | 73.000000 | 1.193000 | 0.690000 | 6.200000 | 1.900000 | 0.078000 | 6.000000 |
| 50% | 0.859000 | 95.086500 | 33.000000 | 4.500000 | 29.000000 | 108.000000 | 2.443000 | 0.870000 | 8.200000 | 5.800000 | 0.118000 | 13.000000 |
| 75% | 28.157400 | 141.746000 | 77.300000 | 4.700000 | 61.000000 | 141.000000 | 4.509000 | 1.050000 | 10.700000 | 9.300000 | 0.162000 | 29.000000 |
| max | 87.386000 | 180.000000 | 735.800000 | 9.100000 | 934.000000 | 358.300000 | 64.498000 | 69.320000 | 99.000000 | 367558.100000 | 1.680000 | 941.000000 |
# Gives number of null values for all columns
earthquake.isnull().sum()
time                    0
latitude                0
longitude               0
depth                   0
mag                     0
magType                 0
nst                136108
gap                 35355
dmin               171548
rms                 19799
net                     0
id                      0
updated                 0
place                1919
type                    0
horizontalError    184991
depthError         106073
magError           171935
magNst              46494
status                  0
locationSource          0
magSource               0
dtype: int64
# Determining the percentage of null values out of all the values
(earthquake.isnull().sum())/(earthquake.shape[0])*100
time                0.000000
latitude            0.000000
longitude           0.000000
depth               0.000000
mag                 0.000000
magType             0.000000
nst                42.297282
gap                10.987013
dmin               53.310710
rms                 6.152790
net                 0.000000
id                  0.000000
updated             0.000000
place               0.596354
type                0.000000
horizontalError    57.488292
depthError         32.963526
magError           53.430975
magNst             14.448598
status              0.000000
locationSource      0.000000
magSource           0.000000
dtype: float64
From the above, we can see that there are many null values in the columns nst, gap, dmin, horizontalError, depthError, magError and magNst.
We are going to take up each of these columns and modify them as needed.
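The two null-value checks above can also be combined into a single summary table. The snippet below is a minimal sketch on a small made-up frame (the `sample` values and the `null_summary` name are illustrative assumptions, not part of the project data); it relies on the fact that the mean of a boolean mask is the fraction of missing entries.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the earthquake data (values are made up).
sample = pd.DataFrame({
    "mag": [4.3, 5.4, 4.0, 5.3],
    "nst": [14.0, np.nan, 11.0, np.nan],
    "gap": [np.nan, 120.0, 95.0, 60.0],
})

# isnull().sum() counts missing entries per column;
# isnull().mean() gives the missing fraction, so * 100 is the percentage.
null_summary = pd.DataFrame({
    "null_count": sample.isnull().sum(),
    "null_pct": sample.isnull().mean() * 100,
}).sort_values("null_pct", ascending=False)

print(null_summary)
```

Sorting by percentage puts the columns that need the most attention at the top, which makes it easier to decide which ones to drop and which ones to impute.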
Cleaning the data would help us remove duplicate values, irrelevant rows and impute meaningful values wherever needed. Below are some of the sub-tasks associated with this:
Each earthquake in the dataset carries a unique ID. To ensure that no earthquake record is repeated, we drop any exact duplicate rows.
# Drop duplicates
before=len(earthquake.index)
print(f"The number of rows before: {before}")
earthquake.drop_duplicates(inplace=True)
after=len(earthquake.index)
print(f"The number of rows after: {after}")
print(f"We have successfully removed {before-after} rows.")
The number of rows before: 321789
The number of rows after: 321360
We have successfully removed 429 rows.
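Note that `drop_duplicates()` without arguments only removes rows that are identical in every column. As a hedged illustration, the sketch below uses a made-up frame to show how whole-row duplicates differ from repeated values in an `id` column; whether the real dataset contains rows that share an `id` but differ elsewhere is not something the output above tells us.

```python
import pandas as pd

# Toy data: 'a2' is an exact duplicate row, 'a3' repeats the id with a different magnitude.
sample = pd.DataFrame({
    "id": ["a1", "a2", "a2", "a3", "a3"],
    "mag": [4.3, 5.4, 5.4, 4.0, 4.1],
})

# Rows identical in every column (what drop_duplicates() removes by default).
exact_dupes = int(sample.duplicated().sum())

# Rows whose id has already appeared, regardless of the other columns.
id_dupes = int(sample.duplicated(subset="id").sum())

# Keeping the first row per id is one possible policy, shown here only as an example.
deduped = sample.drop_duplicates(subset="id", keep="first")

print(exact_dupes, id_dupes, len(deduped))
```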
Let us study what columns are present in our dataset and whether we need them for analysis or not.
earthquake.columns
Index(['time', 'latitude', 'longitude', 'depth', 'mag', 'magType', 'nst',
'gap', 'dmin', 'rms', 'net', 'id', 'updated', 'place', 'type',
'horizontalError', 'depthError', 'magError', 'magNst', 'status',
'locationSource', 'magSource'],
dtype='object')
'updated', 'rms': These columns are not used in our analysis. Their presence does not contribute to the earthquake data interpretation or the results we aim to achieve.
'horizontalError': This column has approximately 57% null values, making it unreliable for precise analysis. The high proportion of missing data in this column could lead to skewed or inaccurate interpretations, hence its removal.
Thus, we will be dropping these columns.
Lower magnitude earthquakes (less than 4) are typically not considered significant for data analysis because they are less likely to cause noticeable damage or human impact, making them less relevant for assessing seismic hazards.
Source: "USGS Earthquake Hazards Program - Magnitude" at https://earthquake.usgs.gov/earthquakes/eventpage/glossary#mag
# Drop columns
earthquake.drop(['updated', 'horizontalError', 'rms'], axis=1, inplace=True)
# Drop rows with mag lower than 4
before=len(earthquake.index)
print(f"The number of rows before are: {before}")
earthquake.drop(earthquake[earthquake.mag<4].index, inplace=True)
after=len(earthquake.index)
print(f"The number of rows after are: {after}")
print(f"We have removed {before-after} rows that had a magnitude lower than 4.")
The number of rows before are: 321360
The number of rows after are: 321359
We have removed 1 rows that had a magnitude lower than 4.
Events with an azimuthal gap greater than 180 degrees and a magnitude below 5.5 are both relatively small and recorded with poor station coverage, so they are unlikely to provide meaningful insights. We drop them to keep the focus on more significant seismic events.
# Drop rows where gap is greater than 180 and magnitude is less than 5.5
print(f"The number of rows before is: {len(earthquake.index)}")
print(f"The number of rows we have dropped: {len(earthquake[(earthquake.gap > 180) & (earthquake.mag < 5.5)].index)}")
earthquake.drop(earthquake[(earthquake.gap > 180) & (earthquake.mag < 5.5)].index, inplace=True)
The number of rows before is: 321359 The number of rows we have dropped: 30990
We further drop rows where dmin is greater than or equal to 10 and magnitude is less than 5.5. This follows the practice of excluding less relevant seismic events that were recorded only by distant stations.
Source: (USGS)
# Drop rows where dmin is greater than or equal to 10 and magnitude is less than 5.5
print(f"We have dropped {len(earthquake[(earthquake.dmin >= 10) & (earthquake.mag < 5.5)].index)} rows.")
earthquake.drop(earthquake[(earthquake.dmin >= 10) & (earthquake.mag < 5.5)].index, inplace=True)
print(f"The number of rows after filtering are: {len(earthquake.index)}")
We have dropped 9005 rows. The number of rows after filtering are: 281364
# Drop rows where magError is greater than or equal to 1 and depth is less than 0.6
print(f"We have dropped {len(earthquake[(earthquake.magError >= 1) & (earthquake.depth < 0.6)].index)} rows.")
earthquake.drop(earthquake[(earthquake.magError >= 1) & (earthquake.depth < 0.6)].index, inplace=True)
print(f"The number of rows after filtering are: {len(earthquake.index)}")
We have dropped 1 rows. The number of rows after filtering are: 281363
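The gap and dmin filters above can also be expressed as one boolean mask, which makes the combined rule easier to read and to audit. The snippet below is a minimal sketch on made-up rows (the `sample` frame and the mask names are illustrative assumptions). It also shows a detail worth keeping in mind: comparisons against NaN evaluate to False, so rows with a missing gap or dmin are kept by these filters.

```python
import numpy as np
import pandas as pd

# Toy rows covering the interesting cases (values are made up).
sample = pd.DataFrame({
    "mag":  [4.2, 6.1, 4.8, 5.0, 4.5],
    "gap":  [200.0, 200.0, np.nan, 90.0, 90.0],
    "dmin": [1.0, 1.0, 2.0, 12.0, np.nan],
})

small = sample["mag"] < 5.5          # below the magnitude threshold
wide_gap = sample["gap"] > 180       # poor azimuthal coverage (NaN compares as False)
far_station = sample["dmin"] >= 10   # nearest station is far away (NaN compares as False)

# A row is dropped if it is small AND fails either coverage check.
drop_mask = small & (wide_gap | far_station)
filtered = sample[~drop_mask]

print(f"Dropped {int(drop_mask.sum())} rows, kept {len(filtered)}.")
```

Because the two sequential drops use the same magnitude threshold, the single mask removes the same rows as applying them one after the other.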
To effectively manage missing values in our earthquake dataset, we apply a consistent method of filling these gaps with mean values for specific columns. This strategy is employed for the following reasons:
'nst' and 'magNst' Columns:
We address the 'nst' (number of stations) and 'magNst' (number of stations reporting magnitude) columns by filling missing data with their mean values. We do this to retain the dataset's overall statistical characteristics. It ensures that our imputed values represent the typical scenario of seismic event observation, which matters for analyses where the quantity of observational data is significant.
'magError' Column:
In the 'magError' column, representing the error in magnitude measurements, filling in missing values with the mean provides a uniform method to account for measurement uncertainties. This uniformity is vital for ensuring that each record in our dataset maintains a standard error estimation, pivotal for accurate and consistent seismic analysis.
'gap', 'dmin', and 'depthError' Columns:
For technical measurements such as 'gap' (angular gap between seismic stations), 'dmin' (distance to the nearest station), and 'depthError' (error in the depth measurement), utilizing the mean value to fill nulls assists in maintaining the integrity of the dataset. This approach ensures that the filled values are in line with the general tendencies observed in these technical aspects, thus preserving the dataset's reliability for in-depth seismic studies.
This methodology ensures that our dataset remains comprehensive and robust, crucial for any analytical processes or insights we wish to derive regarding seismic activities.
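columns_to_fill = ['nst', 'magNst', 'magError', 'gap', 'dmin', 'depthError']
# Fill missing values in each column with its mean
# (assigning the result back avoids an in-place fillna on a column selection, which newer pandas versions warn about)
for column in columns_to_fill:
    earthquake[column] = earthquake[column].fillna(earthquake[column].mean())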
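To see what mean imputation does to a column, the sketch below applies it to a short made-up series (the `nst` values and the `was_missing` flag are illustrative assumptions, not project data). The column mean is unchanged by construction, while the spread shrinks because every filled value sits exactly at the mean; keeping a flag of which entries were imputed is one optional way to tell observed values from filled ones later.

```python
import numpy as np
import pandas as pd

# Toy column with two missing entries (values are made up).
nst = pd.Series([10.0, 20.0, np.nan, 30.0, np.nan, 60.0], name="nst")

# Remember which entries were missing before filling them.
was_missing = nst.isnull()

# Mean imputation: every missing entry receives the mean of the observed values.
filled = nst.fillna(nst.mean())

print("mean before/after:", nst.mean(), filled.mean())
print("std before/after: ", nst.std(), filled.std())
print("imputed entries:  ", int(was_missing.sum()))
```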
Here we have 24 distinct values in the magType column. To simplify the dataset for analysis, we can create broader categories that group the magnitude types based on their method of calculation or the type of seismic waves they measure.
Let us understand how:
# Take out unique values of the column using the unique() method
distinct_magType = earthquake['magType'].unique()
print('Distinct Values in the magType Column - ', distinct_magType)
print(f'The number of distinct magnitude types in the column magType = {len(distinct_magType)}')
Distinct Values in the magType Column - ['mb' 'mwc' 'ml' 'md' 'mwb' 'mw' 'ms' 'mblg' 'mwr' 'mww' 'mlg' 'mh' 'm' 'mc' 'Mb' 'Md' 'mb_lg' 'Ml' 'ms_20' 'mlr' 'mwp' 'mlv' 'ml(texnet)' 'Mi'] The number of distinct magnitude types in the column magType = 24
We have grouped them into four main categories based on our research. Each category below corresponds to a specific method by which the earthquake magnitude was calculated.
Body Wave Magnitudes:
Category Name: "body wave"
Includes: mb, Mb, mb_lg, mblg
Surface Wave Magnitudes:
Category Name: "surface wave"
Includes: ms, ms_20
Moment Magnitudes:
Category Name: "moment"
Includes: mw, mwc, mwb, mwr, mww, mwp, ml, Ml, mlg
Duration Magnitudes:
Category Name: "duration"
Includes: md, Md, m, mh, mc, mlr, mlv, Mi
# Define the translation dictionary
magType_categories = {
    'body wave': ['mb', 'Mb', 'mb_lg', 'mblg'],
    'surface wave': ['ms', 'ms_20'],
    'moment': ['mw', 'mwc', 'mwb', 'mwr', 'mww', 'mwp', 'ml', 'Ml', 'mlg'],
    'duration': ['md', 'Md', 'm', 'mh', 'mc', 'mlr', 'mlv', 'Mi']
}
# Function to replace a magType code with the name of the category that lists it
def replace_with_category(magType, magTypeDict):
    for category, values in magTypeDict.items():
        if magType in values:
            return category
    # Codes that are not listed in any category are returned unchanged
    return magType
earthquake['magType'] = earthquake['magType'].apply(lambda magType: replace_with_category(magType, magType_categories))
earthquake['magType'].value_counts()
magType
body wave       233890
moment           42054
duration          5224
surface wave       194
ml(texnet)           1
Name: count, dtype: int64
Now, let's review the data after filtering and imputing.
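The same grouping can be written without a row-by-row function by inverting the category dictionary into a code-to-category lookup and using `Series.map`. This is a sketch of an alternative, not the approach used above; the shortened `categories` dictionary and the sample codes are chosen only for illustration. Codes missing from the lookup come back as NaN from `map`, so `fillna` restores the original code, matching the behaviour seen for `ml(texnet)`.

```python
import pandas as pd

# Shortened category dictionary, for illustration only.
categories = {
    "body wave": ["mb", "Mb"],
    "surface wave": ["ms", "ms_20"],
}

# Invert it: each individual code points to its category name.
code_to_category = {
    code: category
    for category, codes in categories.items()
    for code in codes
}

mag_type = pd.Series(["mb", "ms_20", "Mb", "ml(texnet)"])

# map() returns NaN for codes that are not in the lookup;
# fillna(mag_type) puts the original code back in those positions.
grouped = mag_type.map(code_to_category).fillna(mag_type)

print(grouped.tolist())
```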
# Displaying the top 10 rows of the dataset
earthquake.head(10)
| | time | latitude | longitude | depth | mag | magType | nst | gap | dmin | net | id | place | type | depthError | magError | magNst | status | locationSource | magSource |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000-12-31T23:50:35.500Z | 52.317 | 160.552 | 33.0 | 4.3 | body wave | 14.0 | 99.604162 | 2.865827 | us | usp000a708 | 151 km ESE of Petropavlovsk-Kamchatsky, Russia | earthquake | 9.909305 | 0.124552 | 3.000000 | reviewed | us | us |
| 1 | 2000-12-31T23:42:59.690Z | -26.547 | -107.261 | 10.0 | 5.4 | moment | 80.0 | 99.604162 | 2.865827 | us | usp000a707 | 225 km ENE of Hanga Roa, Chile | earthquake | 9.909305 | 0.124552 | 29.453822 | reviewed | us | hrv |
| 2 | 2000-12-31T22:07:00.300Z | -38.530 | 178.930 | 91.0 | 4.0 | moment | 11.0 | 99.604162 | 2.865827 | us | usp000a705 | 81 km E of Gisborne, New Zealand | earthquake | 9.909305 | 0.124552 | 29.453822 | reviewed | wel | wel |
| 3 | 2000-12-31T21:56:50.900Z | -38.040 | 178.800 | 33.0 | 5.3 | moment | 58.0 | 99.604162 | 2.865827 | us | usp000a704 | 97 km NE of Gisborne, New Zealand | earthquake | 9.909305 | 0.124552 | 29.453822 | reviewed | wel | hrv |
| 4 | 2000-12-31T16:12:34.400Z | -56.179 | -27.217 | 100.0 | 4.5 | body wave | 14.0 | 99.604162 | 2.865827 | us | usp000a6zz | South Sandwich Islands region | earthquake | 9.909305 | 0.124552 | 2.000000 | reviewed | us | us |
| 5 | 2000-12-31T14:47:12.700Z | -15.224 | -173.797 | 115.1 | 4.5 | body wave | 51.0 | 99.604162 | 2.865827 | us | usp000a6zw | 80 km N of Hihifo, Tonga | earthquake | 9.909305 | 0.124552 | 12.000000 | reviewed | us | us |
| 6 | 2000-12-31T13:04:57.260Z | -19.416 | -176.181 | 246.5 | 4.1 | body wave | 23.0 | 99.604162 | 2.865827 | us | usp000a6zr | 196 km WNW of Pangai, Tonga | earthquake | 9.909305 | 0.124552 | 5.000000 | reviewed | us | us |
| 7 | 2000-12-31T12:56:12.900Z | -12.901 | 166.822 | 50.4 | 4.8 | body wave | 38.0 | 99.604162 | 2.865827 | us | usp000a6zq | 133 km NW of Sola, Vanuatu | earthquake | 9.909305 | 0.124552 | 7.000000 | reviewed | us | us |
| 8 | 2000-12-31T12:05:06.470Z | -44.734 | -79.654 | 33.0 | 4.5 | body wave | 15.0 | 99.604162 | 2.865827 | us | usp000a6zn | Off the coast of Aisen, Chile | earthquake | 9.909305 | 0.124552 | 3.000000 | reviewed | us | us |
| 9 | 2000-12-31T08:56:39.780Z | 44.653 | 147.918 | 100.0 | 4.5 | body wave | 20.0 | 99.604162 | 2.865827 | us | usp000a6zj | 63 km S of Kuril’sk, Russia | earthquake | 9.909305 | 0.124552 | 2.000000 | reviewed | us | us |
The date and time in the dataset are combined in a single column; we will extract the date into a separate column.
# Function to extract the date of the earthquake from the time column, ignoring any null values
def extract_date(time):
    if pd.notnull(time):
        parts = time.split('T')
        if len(parts) > 1:
            return parts[0].strip()
    return None
# Create a new column 'date' using the output of the previous function
earthquake['date'] = earthquake['time'].apply(extract_date)
# Parse string to datetime format
earthquake.time = earthquake.time.map(lambda x: parse(x))
# Displaying the top 5 rows with the new column 'date'
earthquake.head()
| | time | latitude | longitude | depth | mag | magType | nst | gap | dmin | net | id | place | type | depthError | magError | magNst | status | locationSource | magSource | date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2000-12-31 23:50:35.500000+00:00 | 52.317 | 160.552 | 33.0 | 4.3 | body wave | 14.0 | 99.604162 | 2.865827 | us | usp000a708 | 151 km ESE of Petropavlovsk-Kamchatsky, Russia | earthquake | 9.909305 | 0.124552 | 3.000000 | reviewed | us | us | 2000-12-31 |
| 1 | 2000-12-31 23:42:59.690000+00:00 | -26.547 | -107.261 | 10.0 | 5.4 | moment | 80.0 | 99.604162 | 2.865827 | us | usp000a707 | 225 km ENE of Hanga Roa, Chile | earthquake | 9.909305 | 0.124552 | 29.453822 | reviewed | us | hrv | 2000-12-31 |
| 2 | 2000-12-31 22:07:00.300000+00:00 | -38.530 | 178.930 | 91.0 | 4.0 | moment | 11.0 | 99.604162 | 2.865827 | us | usp000a705 | 81 km E of Gisborne, New Zealand | earthquake | 9.909305 | 0.124552 | 29.453822 | reviewed | wel | wel | 2000-12-31 |
| 3 | 2000-12-31 21:56:50.900000+00:00 | -38.040 | 178.800 | 33.0 | 5.3 | moment | 58.0 | 99.604162 | 2.865827 | us | usp000a704 | 97 km NE of Gisborne, New Zealand | earthquake | 9.909305 | 0.124552 | 29.453822 | reviewed | wel | hrv | 2000-12-31 |
| 4 | 2000-12-31 16:12:34.400000+00:00 | -56.179 | -27.217 | 100.0 | 4.5 | body wave | 14.0 | 99.604162 | 2.865827 | us | usp000a6zz | South Sandwich Islands region | earthquake | 9.909305 | 0.124552 | 2.000000 | reviewed | us | us | 2000-12-31 |
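For reference, pandas can parse these ISO 8601 timestamps in a vectorized way, after which the date and other components are available through the `.dt` accessor. The snippet below is a sketch of that alternative on two timestamps copied from the tables above; it assumes every value follows the same `...T...Z` format and is not the method used in the cells above.

```python
import pandas as pd

# Two timestamps in the same format as the 'time' column.
raw = pd.Series(["2000-12-31T23:50:35.500Z", "2022-10-10T00:02:59.585Z"])

# utc=True yields timezone-aware timestamps in UTC (the trailing 'Z' means UTC).
parsed = pd.to_datetime(raw, utc=True)

# Components are then available without splitting strings by hand.
date = parsed.dt.strftime("%Y-%m-%d")
year = parsed.dt.year

print(date.tolist(), year.tolist())
```

Having a numeric `year` available is convenient for questions about trends over time, since it can be used directly as a grouping key.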
If we observe the place column, we can also see that values are inconsistent. We shall make it consistent to be either a State in the United States or the name of a Country.
# Extract the text after the last comma in the 'place' column using a regular expression
earthquake.place = earthquake.place.str.extract(r', (\w+[\s\w]*)$', expand=False)
# Remove the word ' region' from the 'place' column, if present
earthquake.place = earthquake.place.str.replace(' region', '')
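To make the behaviour of this regular expression concrete, the sketch below applies the same pattern to four place strings taken from the tables above. One consequence is that values without a comma, such as "South Sandwich Islands region", do not match and become NaN. The last two lines show one optional way to keep such values by falling back to the original text; that fallback (and the name `with_fallback`) is an illustrative assumption and not part of the cleaning steps above.

```python
import pandas as pd

place = pd.Series([
    "151 km ESE of Petropavlovsk-Kamchatsky, Russia",
    "81 km E of Gisborne, New Zealand",
    "Hokkaido, Japan region",
    "South Sandwich Islands region",
])

# Capture everything after the last ", " (letters, digits and spaces only).
extracted = place.str.extract(r", (\w+[\s\w]*)$", expand=False)

# Strip the trailing ' region' label where present.
cleaned = extracted.str.replace(" region", "", regex=False)
print(cleaned.tolist())

# Optional fallback (an assumption, not the author's method):
# reuse the original text, minus ' region', when there was no comma to split on.
with_fallback = cleaned.fillna(place.str.replace(" region", "", regex=False))
print(with_fallback.tolist())
```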
Now, we can also see that we have various U.S. states denoted by their abbreviations. We shall also rename them to match their actual state names.
# Extract non-null values from the 'place' column and filter those containing two uppercase letters using regex
earthquake.place.dropna()[earthquake.place.dropna().str.contains('^[A-Z]{2}$')].value_counts()
place
CA    469
MX     41
AK     14
NV      2
OK      1
WA      1
Name: count, dtype: int64
# Mapping state/country abbreviations to full names
state_dict = {
    'CA': 'California',
    'MX': 'Mexico',
    'AK': 'Alaska',
    'NV': 'Nevada',
    'OK': 'Oklahoma',
    'WA': 'Washington'
}
# Replace abbreviations with full names
earthquake.place = earthquake.place.replace(state_dict, regex=True)
# Show changed 'place' column's value counts
earthquake.place.value_counts()
place
Indonesia 34160
Japan 27502
Papua New Guinea 15535
Chile 12921
Philippines 12195
...
Belgium 1
Cameroon 1
Denmark 1
Marshall Islands 1
Kentucky 1
Name: count, Length: 226, dtype: int64
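One detail of the abbreviation mapping is worth illustrating. With `regex=True`, `Series.replace` treats each dictionary key as a pattern and replaces it wherever it occurs inside a value, whereas the default (no `regex` argument) replaces only values that equal the key exactly. The sketch below contrasts the two on made-up values; the string "CAPE VERDE" is a hypothetical example and is not claimed to appear in the dataset.

```python
import pandas as pd

state_dict = {"CA": "California", "AK": "Alaska"}

# The last value is hypothetical, chosen to contain the letters 'CA'.
place = pd.Series(["CA", "AK", "Japan", "CAPE VERDE"])

# Exact match: only values equal to a key are replaced.
exact = place.replace(state_dict)

# Pattern match: the key is replaced wherever it occurs inside a value.
substring = place.replace(state_dict, regex=True)

print(exact.tolist())
print(substring.tolist())
```

When the abbreviations have already been isolated as whole values, as the `^[A-Z]{2}$` check suggests, exact matching is usually sufficient and avoids accidental partial replacements.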
A crucial post-data cleaning step involves conducting an exploratory analysis of the dataset, enabling us to uncover key observations and gain insights that will guide the selection of appropriate methods to address our research questions.
#Exploratory Data Analysis
print("\nSummary Statistics:")
earthquake.describe()
Summary Statistics:
| | latitude | longitude | depth | mag | nst | gap | dmin | depthError | magError | magNst |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 | 281363.000000 |
| mean | 4.931550 | 42.579743 | 82.893424 | 4.560977 | 60.608269 | 99.604162 | 2.865827 | 9.909305 | 0.124552 | 29.453822 |
| std | 28.767533 | 119.515324 | 131.235951 | 0.431818 | 60.679682 | 36.985163 | 1.584898 | 696.542434 | 0.046802 | 46.344296 |
| min | -84.422000 | -179.999700 | -3.290000 | 4.000000 | 0.000000 | 6.500000 | 0.000000 | -1.000000 | 0.000000 | 0.000000 |
| 25% | -17.388050 | -71.024200 | 10.000000 | 4.300000 | 27.000000 | 73.400000 | 2.556000 | 4.400000 | 0.121000 | 7.000000 |
| 50% | 1.304000 | 99.378000 | 33.630000 | 4.500000 | 60.608269 | 99.604162 | 2.865827 | 9.100000 | 0.124552 | 18.000000 |
| 75% | 30.004700 | 141.927050 | 85.200000 | 4.700000 | 60.608269 | 126.000000 | 2.865827 | 9.909305 | 0.124552 | 29.453822 |
| max | 87.386000 | 180.000000 | 735.800000 | 9.100000 | 934.000000 | 313.000000 | 39.730000 | 367558.100000 | 1.642000 | 941.000000 |
As we embark on this investigative journey, our goal is to unravel the intricacies of earthquakes, paving the way for informed strategies to mitigate their impact. By addressing key questions, we seek to not only comprehend the nature of seismic events but also lay the groundwork for effective solutions.
Correlation Analysis: Uncovering correlations between seismic attributes is a fundamental first step, and heatmaps make these relationships quick to read. Starting with this analysis helps identify the patterns and strengths of the relationships between variables, which can in turn guide resilience strategies in seismic regions.
Heatmaps visually represent the correlation matrix of seismic attributes, revealing patterns and insights. The color-coded gradients quickly identify strong or weak correlations, providing visual clarity.
eq_correlation_matrix = earthquake.corr(numeric_only=True)
# Creating a heatmap using seaborn, hiding the redundant upper triangle (and the diagonal)
plt.figure(figsize=(9, 6))
sns.heatmap(eq_correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", linewidths=.5,
            mask=np.triu(np.ones_like(eq_correlation_matrix, dtype=bool)))
plt.title('Correlation Heatmap for Earthquake')
plt.show()
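To make the heatmap cells concrete, here is a minimal sketch of the Pearson correlation coefficient that each cell reports. It uses only the standard library, and the depth and magnitude values are hypothetical, chosen for illustration rather than taken from our dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (proportional to the variances)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, NOT taken from the earthquake dataset
depth = [10.0, 35.0, 70.0, 150.0, 600.0]
mag = [6.1, 5.4, 5.0, 4.6, 4.2]

r = pearson(depth, mag)
print(f"depth vs mag: r = {r:.2f}")
```

A value close to -1 or +1 indicates a strong linear relationship, while a value near 0 indicates a weak one. The sign only gives the direction of the relationship.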
The heatmap reveals potential correlations between seismic parameters, including gap, dmin (minimum distance to the nearest station), rms (root mean square), and nst (number of reporting stations). These correlations may point to trends in how the attributes relate to one another. We consider two of them here, each with an observation and the inferences that can be drawn from it:
Observation - The heatmap shows a negative correlation between earthquake depth and magnitude. In this dataset, deeper earthquakes tend to have lower magnitudes, while shallower earthquakes tend to have higher magnitudes. This is a tendency rather than a rule, but it is consistent with general seismological knowledge.
Business Implication - Planning for Infrastructure Resilience (Magnitude and Depth Correlation): Enterprises, specifically those engaged in vital infrastructure (like energy, transportation, and utilities), may have to take into account the relationship between the depth and magnitude of earthquakes. Deeper earthquakes of possibly lower magnitude can still affect planning for infrastructure resilience. An understanding of the characteristics of seismic events can inform the design and retrofitting of structures to withstand varied magnitudes and depths.
Observation - The correlation matrix suggests a positive correlation between magnitude (mag) and the number of reporting stations (nst). A likely explanation is that higher-magnitude earthquakes release more energy and are therefore recorded by more seismic stations.
Business Implication - Risk assessment and insurance: When evaluating risk associated with seismic damage, insurance companies may take into account the relationship between the number of reporting stations and the magnitude of the earthquake. Greater potential for damage may be indicated by higher magnitude earthquakes that draw more reporting stations, which could have an impact on risk assessment models and insurance rates.
Let us now delve into the temporal aspects of earthquake occurrences to discern patterns and trends over time and address our first question.
Q1. Is there a noticeable trend or pattern in earthquake activity over time? If so, what is causing the trend?
To gain comprehensive insights into the nature of earthquakes, we employ two pivotal visualizations: a line chart and a histogram. These will provide insights into the severity and frequency of earthquakes.
Line Chart: Frequency of Earthquakes Over Time
Similar to the above graph the line chart here will showcase the distribution of earthquake frequency over different time intervals. This analysis aims to uncover patterns in the occurrence of seismic events and identify potential clusters or spikes.
# create a new df for time series analysis
quake_ts = earthquake.copy()
# Reset to a default integer index unless the index is already named 'index'
if quake_ts.index.name != 'index':
    quake_ts.reset_index(drop=True, inplace=True)
# Create bins
bins = np.linspace(start=1999, stop=2023, num=9)
# Extract the year
quake_ts['year'] = quake_ts['time'].apply(lambda t: int(str(t).split('-')[0]))
# Create bins based on the 'bins' column
quake_ts['Year Bin'] = pd.cut(quake_ts['year'], bins)
# Get the counts for each bin
bin_counts = quake_ts['Year Bin'].value_counts().sort_index()
# Extract bin labels for x-axis
bin_labels = [f'{int(start)}-{int(stop)}' for start, stop in zip(bins[:-1], bins[1:])]
# Create marker labels as whole numbers
marker_labels = [f"{count:.0f}" for count in bin_counts]
# Create a Plotly line graph
fig = go.Figure()
fig.add_trace(go.Scatter(x=bin_labels, y=bin_counts, mode='lines+markers', marker=dict(size=8),
                         line=dict(shape='linear', color='blue'), text=marker_labels))
# Customize layout
fig.update_layout(title='Frequency of Earthquakes Over Time', title_x=0.5,
                  xaxis=dict(title='Year Bins'),
                  yaxis=dict(title='Number of Earthquakes'), height=600,
                  autosize=True,  # Set autosize to True for width adjustment
                  )
# Show the plot
fig.show()
From this visualization, we can see a slightly increasing trend in the number of recorded earthquakes over the 23-year period. This aligns with our expected findings. We attribute the increase to two causes.
Overall, the number of recorded earthquakes over the last 20+ years has likely increased due to a combination of worsening climate change and advancements in earthquake detection technology. To read more about these topics, refer to the citations at the end of this notebook.
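The year binning behind the line chart can be sketched without pandas. The snippet below mimics the default behaviour of `pd.cut`, where each bin is open on the left and closed on the right, so a year equal to the first edge (1999) falls outside every bin. The list of years is hypothetical and the helper name `year_bin` is our own, introduced only for illustration.

```python
from bisect import bisect_left
from collections import Counter

# Same edges as np.linspace(1999, 2023, 9): 1999, 2002, ..., 2023
edges = [1999 + 3 * i for i in range(9)]

def year_bin(year):
    """Return the label of the right-closed bin (lo, hi] containing year, or None."""
    i = bisect_left(edges, year) - 1
    if 0 <= i < len(edges) - 1:
        return f"{edges[i]}-{edges[i + 1]}"
    return None  # outside all bins, like NaN in pd.cut

# Hypothetical event years, NOT taken from the dataset
years = [2000, 2002, 2003, 2011, 2011, 2023, 1999]
counts = Counter(b for b in map(year_bin, years) if b is not None)

for label in sorted(counts):
    print(label, counts[label])
```

Because the bins are right-closed, a label such as 1999-2002 covers the years 2000 through 2002, and 2002 is not counted again in 2002-2005.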
In this section of the analysis, we delve into the various types of seismic activity that have occurred in these years. We aim to understand the frequency and magnitude associated with each type of seismic event. Examining the types of events, their frequency, and their magnitudes gives a comprehensive overview of the seismic landscape and a deeper understanding of its patterns and characteristics.
Box Plot : Magnitude and Frequency Distribution of Various Seismic Events
The box plot serves as a powerful tool for comparing the magnitude distribution of various earthquake types, offering insights into their characteristic seismic strengths.
fig = px.box(earthquake, x='type', y='mag', color='type',
             labels={'type': 'Earthquake Type', 'mag': 'Magnitude'})
# Remove legend
fig.update_traces(showlegend=False)
# Adjust the size and layout of the plot
fig.update_layout(
    height=600,
    autosize=True,  # Set autosize to True for width adjustment
    plot_bgcolor='rgba(192, 192, 192, 0.25)',
    title='Box Plot of Magnitudes for Different Earthquake Types', title_x=0.5
)
fig.show()
Distribution of Magnitude and Central Tendency - Explosions are observed to have a lower median magnitude than earthquakes. Knowledge of the typical magnitude distribution can impact risk assessments and safety procedures for companies operating in areas where industrial activities or explosions are frequent. Depending on the anticipated seismic impact, it might be essential to modify emergency response plans and infrastructure.
Range Variability in Magnitude - According to the box plot, earthquakes and explosions show a wider magnitude range than the other event types. Companies, especially those in the building, infrastructure, and real estate sectors, may need to take this variability into account. This knowledge can improve overall resilience by influencing the design and construction of structures to withstand a variety of magnitudes.
Planning for Infrastructure and Resilience - The variation in magnitude distributions among the event types suggests a need for adaptable infrastructure planning. Companies engaged in the planning of critical infrastructure, like those in the energy and utility sectors, may need to create systems and structures that take into consideration the wide range of seismic magnitudes. This adaptability improves overall resilience and keeps vital services running through a range of seismic events.
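The quantities a box plot draws for each event type can be computed directly. The sketch below uses the standard library and hypothetical magnitudes (not our dataset) to show the median, quartiles, and interquartile range behind each box. Quartile conventions differ slightly between tools, so the values in a plotting library may not match these exactly.

```python
from statistics import quantiles

# Hypothetical magnitudes per event type, NOT taken from the dataset
samples = {
    "earthquake": [4.1, 4.3, 4.5, 4.6, 4.9, 5.4, 6.8],
    "explosion": [4.0, 4.1, 4.2, 4.2, 4.4],
}

def box_stats(values):
    """Quartiles and interquartile range, as summarized by a box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}

summary = {name: box_stats(vals) for name, vals in samples.items()}
for name, stats in summary.items():
    print(name, stats)
```

Comparing the `median` entries corresponds to comparing the centre lines of the boxes, and comparing the `iqr` entries corresponds to comparing the box heights.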
We now turn our attention to the potential influence of human activities, specifically mining and blasting, on seismic events. The objective is to scrutinize whether these activities correlate with an increase in the frequency and severity of earthquakes.
Q2. Has mining/blasting had any impact on the frequency and/or severity of seismic events? If so, what is the relationship?
Hypothesis: We hypothesize a positive correlation between mining/blasting activities and seismic events. Limitation: Because our dataset only includes events with a magnitude of 4 or greater, lower-magnitude events induced by mining are likely underrepresented.
Moving forward, we also explore seismic events produced by nuclear explosions, seeking to unveil any discernible relationships.
Expectation: We anticipate a link between nuclear explosions and recorded seismic activity. Complexity: The relationship between nuclear activities and the magnitude of the resulting seismic events is intricate.
# Define URLs for mining and nuclear icons
nuclear_icon = 'https://em-content.zobj.net/source/google/387/rocket_1f680.png'
mining_icon = 'https://em-content.zobj.net/source/samsung/380/collision_1f4a5.png'
# Filter man-made events from the earthquake DataFrame
manmade = earthquake[earthquake.type.str.contains('mining|mine|^explosion$|nuclear')]
manmade.type.value_counts()
# Initialize a folium map
world_map = folium.Map(location=[0, 0],
zoom_start=2,
attr='Mapbox',
max_bounds=True,
tiles='https://api.mapbox.com/styles/v1/mapbox/streets-v11/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiamFpbm0yNyIsImEiOiJjbHAwMTJrcDIwNHpwMnFvNDNwb3gyZ3V2In0.O4iM-p_iX83aK5bv4gdihQ',
)
# Define legend with the icon URLs
legend_items = {
'Nuclear Explosions': nuclear_icon,
'Mining Explosions': mining_icon,
}
# Create a legend using HTML
legend_html = '''
<div style="position: fixed;
bottom: 40px; left: 50px; width: 175px; height: 70px;
border:2px solid grey; z-index:9999; font-size:14px;
background-color:white; text-align: center;">
'''
# Populate the legend with items
for event_type, image_url in legend_items.items():
    legend_html += f'''
    <img src="{image_url}" style="height:30px;width:30px;"> {event_type}<br>
    '''
legend_html += '''
</div>
'''
# Add the legend to the map
world_map.get_root().html.add_child(folium.Element(legend_html))
# Plot markers for man-made events on the map
for index, row in manmade.iterrows():
    lat, long = row['latitude'], row['longitude']
    # Choose the appropriate image based on event type
    if 'nuclear' in row.get('type', '').lower():
        image_url = nuclear_icon
    else:
        image_url = mining_icon
    # Extract event details
    event_date = row['date']
    event_mag = row['mag']
    # Populate tooltip content
    tooltip_data = f'Event Date: {event_date}<br>Magnitude: {event_mag}'
    # Set icon size for markers
    icon = folium.CustomIcon(image_url, icon_size=(30, 30))
    # Add markers to the map
    folium.Marker(location=[lat, long], icon=icon, tooltip=tooltip_data).add_to(world_map)
# Add title to the map
title_html = '''
<h3 align="center" style="font-size:20px"><b>Map of Mining and Nuclear Explosions</b></h3>
'''
world_map.get_root().html.add_child(folium.Element(title_html))
# Display the map
world_map
In the interactive map above, it can be observed by zooming in that there is a concentration of explosions in Wyoming. Upon further investigation, it was discovered that these explosions all occurred due to mining. This is likely because Wyoming has been the top producer of coal in the United States since 1986, producing more than 40% of annual US coal supply through mining (Wyoming State Geological Survey, n.d.).
Mining explosions are far from uncommon in Wyoming's history. From the 1800s to the late 1900s, multiple fatal mining accidents occurred. Most notably, over 100 miners lost their lives in the Hanna mine explosion in 1903 (Rea, 2014).
However, our map only shows explosions from the 21st century. Because our data is more recent, it suggests that the severity of mining explosions has decreased compared to the 1900s, plausibly because of increased safety measures. A shift away from the use of coal in the electric power industry has likely contributed as well. Overall, this shift away from coal is a positive one in the context of seismic activity, given the potentially disastrous effects of seismic events caused by mining.
Another area of concentration on the map above is in North Korea, where six nuclear explosions occurred between 2006 and 2017. Upon further investigation, it became clear that all six were the result of nuclear weapons testing (BBC, 2019). This finding is relevant to any government that plans to conduct nuclear weapons tests, because each detonation produces seismic activity that is recorded alongside natural earthquakes.
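The filter that selects man-made events relies on a regular expression, and `str.contains` matches it anywhere in the string. The sketch below shows with the standard `re` module what the pattern does and does not match. The event-type labels are hypothetical examples and may not all occur in our data, and the stricter word-boundary pattern is our own suggestion rather than part of the analysis above.

```python
import re

# Pattern used in the filter above; pandas str.contains behaves like re.search
loose = re.compile(r"mining|mine|^explosion$|nuclear")
# Hypothetical stricter variant that only matches whole words
strict = re.compile(r"\bmin(?:e|ing)\b|^explosion$|\bnuclear\b")

# Hypothetical event-type labels for illustration
event_types = [
    "earthquake", "explosion", "mining explosion", "nuclear explosion",
    "mine collapse", "quarry blast", "undetermined",
]

loose_hits = [t for t in event_types if loose.search(t)]
strict_hits = [t for t in event_types if strict.search(t)]

print(loose_hits)
print(strict_hits)
```

With these labels, the loose pattern also picks up "undetermined" because it contains the substring "mine", while neither pattern captures "quarry blast". Checking `value_counts()` on the filtered types, as done above, is a quick way to confirm that only the intended categories were selected.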
In this segment of our analysis, we try to understand the critical question:
Q3. Has there been any adverse impact of earthquake activity on nuclear power plants? If so, what is the relationship?
This inquiry is paramount for understanding the potential implications of seismic events for nuclear power generation. By examining data and exploring patterns, we aim to shed light on whether earthquakes affect nuclear power plants. This investigation is crucial for ensuring the safety and resilience of regions with nuclear facilities and informs strategies for risk mitigation in the context of seismic events.
For this, we are considering a new dataset that provides information on nuclear power plants worldwide, sourced from reputable outlets like Declan Butler of Nature News and the International Atomic Energy Agency's Power Reactor Information Systems. The data includes essential details like plant location (Longitude, Latitude), region, country, number of reactors, and the affected population. Our analysis will concentrate on each plant's name, longitude, and latitude in order to relate plant locations to earthquakes.
nuclear = pd.read_csv('energy-pop-exposure-nuclear-plants-locations_plants.csv')
nuclear.head()
| | FID | Region | Country | Plant | NumReactor | Latitude | Longitude | p90_1200 | p00_1200 | p10_1200 | ... | p10r_600 | p90_300 | p00_300 | p10_300 | p90u_300 | p00u_300 | p10u_300 | p90r_300 | p00r_300 | p10r_300 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Europe - Western | SWEDEN | AGESTA | 1 | 59.206022 | 18.082872 | 187382000 | 188684000 | 188250000 | ... | 8972550.0 | 5013240.0 | 5227700.0 | 5471110.0 | 3426880.0 | 3596030.0 | 3764920.0 | 1586350.0 | 1631670.0 | 1706190 |
| 1 | 1 | Europe - Western | SPAIN | ALMARAZ | 2 | 39.808100 | -5.696940 | 136675000 | 147718000 | 163429000 | ... | 19453700.0 | 17756500.0 | 18187800.0 | 20185200.0 | 10986000.0 | 11415400.0 | 12689800.0 | 6770480.0 | 6772380.0 | 7495340 |
| 2 | 2 | America - Latin | BRAZIL | ANGRA | 3 | -23.007857 | -44.458098 | 99195200 | 113894000 | 127898000 | ... | 18605600.0 | 39546400.0 | 44701700.0 | 50210600.0 | 32788500.0 | 37064600.0 | 41648300.0 | 6757940.0 | 7637110.0 | 8562300 |
| 3 | 3 | America - Northern | UNITED STATES OF AMERICA | ARKANSAS ONE | 2 | 35.310320 | -93.231289 | 117830000 | 132729000 | 146482000 | ... | 9498240.0 | 5603180.0 | 6226360.0 | 6866840.0 | 3779400.0 | 4198920.0 | 4633770.0 | 1823770.0 | 2027450.0 | 2233070 |
| 4 | 4 | Europe - Western | SPAIN | ASCO | 2 | 41.200000 | 0.566670 | 271854000 | 287134000 | 308922000 | ... | 17594700.0 | 14398500.0 | 15095600.0 | 16830700.0 | 11215400.0 | 11773600.0 | 13151300.0 | 3183180.0 | 3322050.0 | 3679470 |
5 rows × 61 columns
eq_nuclear = earthquake[earthquake.mag >= 6]
eq_nuclear.sort_values(by='mag', ascending=False)
# Initialize the folium map
reactor_map = folium.Map(location=[35, 139], zoom_start=5,
tiles='https://api.mapbox.com/styles/v1/mapbox/streets-v11/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiamFpbm0yNyIsImEiOiJjbHAwMTJrcDIwNHpwMnFvNDNwb3gyZ3V2In0.O4iM-p_iX83aK5bv4gdihQ',
attr='Mapbox', max_bounds=True)
# Function to categorize the earthquake colour on the map based on magnitude
# Returns a (shadow colour, marker colour) pair
def earthquake_map_colors(magnitude):
    if magnitude >= 8.0:
        return '#ff9999', '#ff0000'  # Red shade
    elif magnitude >= 7.0:
        return '#ffd699', '#ff8c00'  # Orange shade
    elif magnitude >= 6.0:
        return '#ffff99', '#ffff00'  # Yellow shade
# Function to categorise the radius of impact based on the magnitude
# (despite the '_kms' suffix, the value is returned in meters, as folium.Circle expects)
def impact_radius_kms(magnitude):
    if magnitude >= 8.0:
        return 500000
    elif magnitude >= 7.0:
        return 250000
    elif magnitude >= 6.0:
        return 100000
    else:
        return 0
# Function to check if an earthquake's impact radius contains any nuclear plant
def nuclear_plant_vicinity_check(earthquake, radius, plants):
    nearby_plants = []
    for _, plant in plants.iterrows():
        # Using geodesic and feeding two sets of locations to find the distance between them
        distance = geodesic((earthquake['latitude'], earthquake['longitude']), (plant['Latitude'], plant['Longitude'])).kilometers
        # If the nuclear plant is within the impact radius (converted from meters to kms) we add the plant to our interest list
        if distance <= radius / 1000:
            nearby_plants.append(plant)
    return nearby_plants
# Passing the highest magnitude earthquake for each location in a dictionary
highest_magnitude_earthquakes = {}
# Passing all nearby reactors to a list
all_nearby_plants = []
# Using the earthquake data to start our iteration
for _, eq in eq_nuclear.iterrows():
    # Passing the location as Tuple
    location = (eq['latitude'], eq['longitude'])
    impact_radius = impact_radius_kms(eq['mag'])
    nearby_plants = nuclear_plant_vicinity_check(eq, impact_radius, nuclear)
    # Looking for the highest magnitude earthquake at every location
    if nearby_plants:
        if (location not in highest_magnitude_earthquakes or eq['mag'] > highest_magnitude_earthquakes[location]['mag']):
            highest_magnitude_earthquakes[location] = eq
        all_nearby_plants.extend(nearby_plants)  # Add the point of interest plants to the list
# Sorting the earthquakes in ascending order of magnitude so the strongest are drawn last and sit on top
sorted_earthquakes = sorted(highest_magnitude_earthquakes.values(), key=lambda x: x['mag'], reverse=False)
# Plotting the values on the map
for eq in sorted_earthquakes:
    impact_radius = impact_radius_kms(eq['mag'])
    shadow_color, marker_color = earthquake_map_colors(eq['mag'])
    # Mapping the Impact circle
    folium.Circle(
        location=[eq['latitude'], eq['longitude']],
        radius=impact_radius,
        color=shadow_color,
        fill=True,
        fill_opacity=0.3,
        weight=0
    ).add_to(reactor_map)
    # Plotting the earthquake hover data
    folium.CircleMarker(
        location=[eq['latitude'], eq['longitude']],
        radius=4,
        color=marker_color,
        fill=True,
        fill_color=marker_color,
        tooltip=f'Earthquake: Mag {eq["mag"]} <br>Date: {eq["date"]}'
    ).add_to(reactor_map)
# Fetching all the unique nuclear plants in the range of earthquake
unique_affected_nuclear_plants = {plant['Plant']: plant for plant in all_nearby_plants}.values()
# Plot only the affected nuclear plants with a custom icon
icon_url = r'https://i.imgur.com/i4sPkgI.png' # Passing the icon url that points each reactor on map
for plant in unique_affected_nuclear_plants:
    icon = folium.CustomIcon(icon_url, icon_size=(20, 20))  # Preparing a custom sized icon
    folium.Marker(
        location=[plant['Latitude'], plant['Longitude']],
        icon=icon,
        tooltip=f'Nuclear Plant: {plant["Plant"]}'
    ).add_to(reactor_map)
legend_html = '''
<div style="position: fixed;
bottom: 50px; left: 50px; width: 150px; height: auto;
border:2px solid grey; z-index:9999; font-size:14px;
background: white; padding: 5px; border-radius: 6px;
box-shadow: 0 0 10px rgba(0,0,0,0.5);">
<b>Earthquake Magnitude</b> <br>
<span style="height:10px;width:10px;background-color:#ff9999;display:inline-block;border-radius:50%;"></span> >= 8.0<br>
<span style="height:10px;width:10px;background-color:#ffd699;display:inline-block;border-radius:50%;"></span> 7.0 - 7.9<br>
<span style="height:10px;width:10px;background-color:#ffff99;display:inline-block;border-radius:50%;"></span> 6.0 - 6.9<br>
<b>Nuclear Power Plant</b><br>
<img src="https://i.imgur.com/i4sPkgI.png" style="width: 20px; height: 20px;"/> Nuclear Plant<br>
</div>
'''
# Add title to the map
title_html = '''
<h3 align="center" style="font-size:20px"><b>Map of Nuclear Power Plants within the impact radius of Earthquakes</b></h3>
'''
reactor_map.get_root().html.add_child(folium.Element(title_html))
# Add the legend to the map
reactor_map.get_root().html.add_child(folium.Element(legend_html))
reactor_map
The map shows nuclear power plants that fall within the impact radius of an earthquake. The impact radius approximates the area an earthquake reaches based on its magnitude (100 km, 250 km, or 500 km in our code). A higher-magnitude earthquake reaches a greater geographical area and thus has a bigger impact radius and a bigger circle on the map. The large shaded circles represent the impact radius and the small circle markers represent the earthquakes themselves. The colors represent the magnitude of the earthquake: red circles indicate magnitudes of 8.0 and above, orange circles magnitudes from 7.0 to 7.9, and yellow circles magnitudes from 6.0 to 6.9. Hovering over an earthquake shows its magnitude and date.
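The vicinity check described above can be illustrated with a small standard-library sketch. It swaps the geodesic distance used in our code for the simpler haversine (great-circle) formula, which treats the Earth as a sphere and is therefore only an approximation. The earthquake, the plant names, and all coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def impact_radius_in_km(magnitude):
    """Same magnitude tiers as the notebook, expressed in kilometers."""
    if magnitude >= 8.0:
        return 500
    if magnitude >= 7.0:
        return 250
    if magnitude >= 6.0:
        return 100
    return 0

# Hypothetical earthquake and plants, NOT taken from the datasets
quake = {"latitude": 0.0, "longitude": 0.0, "mag": 7.2}
plants = [
    {"Plant": "PLANT A", "Latitude": 0.0, "Longitude": 1.0},  # roughly 111 km away
    {"Plant": "PLANT B", "Latitude": 0.0, "Longitude": 5.0},  # roughly 556 km away
]

radius = impact_radius_in_km(quake["mag"])
nearby = [
    p["Plant"] for p in plants
    if haversine_km(quake["latitude"], quake["longitude"], p["Latitude"], p["Longitude"]) <= radius
]
print(radius, nearby)
```

For the distances involved here, the spherical approximation differs from an ellipsoidal geodesic by well under one percent, which is small compared to the coarse radius tiers.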
Now that the map has been created and explained, more in depth analysis can be done to gain insights on the research question at hand.
In the map above, it can be observed that the highest concentration of nuclear power plants is in Japan. This is because nuclear energy has been a national strategic priority for Japan since 1973 (World Nuclear Association, 2019). Our 'Earthquake Severity Around the World' map also shows that Japan is more prone to earthquakes than other areas of the world.
From these 2 observations regarding Japan, the question arises: Do earthquakes in Japan, an earthquake prone country, have an impact on the nuclear power plants in Japan?
To answer this question, further research has been done on the earthquakes and power plants in Japan.
The most notable data point in Japan on this map is the magnitude 9 earthquake. Upon further research, it became clear that this is the Great East Japan Earthquake of 2011. This earthquake caused 11 reactors at 4 power plants to shut down automatically in accordance with emergency protocol (Nuclear Power Plants and Earthquakes - World Nuclear Association, 2021). However, at the Fukushima-Daiichi plant (see the map above) an accident followed because of a tsunami caused by the earthquake. The plant's emergency protocols were not designed to withstand this type of natural disaster, and the accident was ultimately rated at level 7, the highest level on the International Nuclear and Radiological Event Scale (INES). The INES is a scale used for communicating the safety significance of nuclear and radiological events.
This short case study on Japanese nuclear power plants reveals that building nuclear power plants in earthquake-prone areas is a significant risk. Despite the protocols in place at the Fukushima-Daiichi plant, a major accident still occurred.
Although many of the nuclear power plants on the above map have not been researched for the purposes of this analysis, an argument can be made that these nuclear power plants should ensure that proper emergency protocols are in place in the event that more earthquakes occur near the plant.
As we navigate through the intricacies of seismic events, our focus shifts to understanding the regions most susceptible to severe earthquakes.
Q4. Which regions of the world suffer most from severe earthquakes and how can we use this information to prioritize earthquake drills and resource allocation?
By pinpointing the regions that suffer most from severe earthquakes, we aim to enhance our preparedness strategies, ensuring that communities in high-risk zones are well-equipped to respond to and recover from seismic incidents. This analysis serves as a crucial foundation for developing targeted plans and initiatives that contribute to the safety and resilience of these earthquake-prone regions.
Interactive Map : Earthquake Severity Around the World
# Function to assign color based on earthquake magnitude
def magnitude_color(magnitude):
    if 7 <= magnitude < 7.5:
        return 'beige'
    elif 7.5 <= magnitude < 8:
        return 'orange'
    elif 8 <= magnitude < 8.5:
        return 'red'
    elif 8.5 <= magnitude < 9:
        return 'darkred'
    else:  # magnitude 9 and above (the data is filtered to magnitude >= 7 below)
        return 'purple'
# Here we are filtering the dataset for earthquakes with a magnitude of 7 or higher
severe_earthquakes_df = earthquake[earthquake['mag'] >= 7]
# Initialize map with a tile layer that uses English place names -
earthquake_map = folium.Map(location=[severe_earthquakes_df['latitude'].mean(), severe_earthquakes_df['longitude'].mean()],
tiles='https://api.mapbox.com/styles/v1/mapbox/streets-v11/tiles/{z}/{x}/{y}?access_token=pk.eyJ1IjoiamFpbm0yNyIsImEiOiJjbHAwMTJrcDIwNHpwMnFvNDNwb3gyZ3V2In0.O4iM-p_iX83aK5bv4gdihQ',
attr='Mapbox',
zoom_start=2, max_bounds=True)
# Create a MarkerCluster object
marker_cluster = MarkerCluster().add_to(earthquake_map)
# This function creates a custom popup for a marker
def create_popup(row):
    iframe = IFrame(f'<b>Magnitude:</b> {row["mag"]}<br>'
                    f'<b>Depth:</b> {row["depth"]} km<br>'
                    f'<b>Location:</b> {row["place"]}',
                    width=200, height=100)
    return folium.Popup(iframe, max_width=300)
# Add markers to the cluster
for index, row in severe_earthquakes_df.iterrows():
    folium.Marker(
        location=[row['latitude'], row['longitude']],
        icon=folium.Icon(color=magnitude_color(row['mag']), icon='info-sign'),
        popup=create_popup(row)
    ).add_to(marker_cluster)
# Add Custom Legend as an HTML element
legend_html = '''
<div style="position: fixed;
bottom: 50px; left: 50px; width: auto; height: auto;
border:2px solid grey; z-index:9999; font-size:14px;
background: #ffffff69; padding: 5px; border-radius: 6px;
box-shadow: 0 0 10px rgba(0,0,0,0.5);">
<b>Earthquake Magnitude</b> <br>
<i class="fa fa-circle" style="color:beige"></i> 7 - 7.5<br>
<i class="fa fa-circle" style="color:orange"></i> 7.5 - 8<br>
<i class="fa fa-circle" style="color:red"></i> 8 - 8.5<br>
<i class="fa fa-circle" style="color:darkred"></i> 8.5 - 9<br>
<i class="fa fa-circle" style="color:purple"></i> >= 9
</div>
'''
earthquake_map.get_root().html.add_child(folium.Element(legend_html))
# Display the map
earthquake_map
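The magnitude-to-colour mapping used for the markers can also be written as a table lookup, which keeps the thresholds and the legend in one place. This is a sketch of an alternative, not the code used for the map: the function name `magnitude_color_lookup` and the 'gray' fallback for magnitudes below 7 are our own assumptions.

```python
from bisect import bisect_right

# Lower bounds of each colour band; magnitudes below 7 get the fallback colour
THRESHOLDS = [7.0, 7.5, 8.0, 8.5, 9.0]
COLORS = ["gray", "beige", "orange", "red", "darkred", "purple"]

def magnitude_color_lookup(magnitude):
    """Return the marker colour for a magnitude using half-open bands [lo, hi)."""
    return COLORS[bisect_right(THRESHOLDS, magnitude)]

for m in (6.9, 7.0, 7.49, 7.5, 8.2, 8.5, 9.1):
    print(m, magnitude_color_lookup(m))
```

Each band includes its lower bound and excludes its upper bound, matching the `7 <= magnitude < 7.5` style of comparison in the function above.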
The map highlights concentrated seismic activity, focusing on earthquakes with a magnitude of 7 or higher. These cluster notably along the western coasts of North and South America and along the belt extending from Japan through Southeast Asia to New Zealand.
Subduction Zone : Subduction zones occur when one tectonic plate is forced beneath another.
Pacific Ring of Fire: The horseshoe-shaped Pacific Ring of Fire, encompassing 75% of the world's active volcanoes, exhibits frequent earthquakes along a path from South America to Japan and New Zealand. This dynamic region is characterized by the convergence of major tectonic plates.
Seismic Belt : The region from New Zealand to Japan linked to the Pacific Ring of Fire, experiences significant seismic activity due to subduction of oceanic plates beneath continental plates.
Ring of Fire Overview : The Pacific Ring of Fire encircles the Pacific Ocean basin and is marked by the convergence of several major tectonic plates, including the Pacific Plate, North American Plate, South American Plate, Eurasian Plate, and Indo-Australian Plate. The interaction of these plates, particularly through subduction zones, results in frequent seismic and volcanic events.
Global Impact : Notable earthquakes, such as the 2011 Tohoku quake in Japan and the 2004 Indian Ocean event, are associated with the Ring of Fire. This region's seismic and volcanic intensity has global implications, affecting nearby and distant areas alike.
Let's consider the measures taken by Japan, Indonesia, Philippines, and Chile in the context of earthquake mitigation measures:
Countries vulnerable to earthquakes, including Chile, Indonesia, Japan, and the Philippines, have implemented safeguards to mitigate seismic impacts on people, economies, and infrastructure. Alongside existing measures, these nations have taken or are considering the following actions:
Economy : Japan boasts one of the world's largest and most advanced economies.
Earthquake Measures :
Suggestions :
Economy : Classified as a developing economy, Indonesia faces resource constraints despite ongoing expansion.
Earthquake Measures :
Suggestions :
Indonesia must regularly update its strategies, considering the evolving seismic risks and changing infrastructure and community landscape.
Economy : Positioned as a developing economy, the Philippines has a broad economic foundation and is in a state of development.
Earthquake Measures :
Suggestions :
A comprehensive approach will enhance the Philippines' resilience to seismic risks, ensuring the safety of its people, economy, and infrastructure in the face of disasters and climate challenges.
Economy : A stable and diversified economy makes Chile one of the most prosperous countries in South America.
Earthquake Measures :
Suggestions :
Chile can strengthen its resilience, protect its population, and promote sustainable development by continually adapting to the dynamic nature of seismic hazards.
In summary, our project thoroughly examined seismic activity from 2000 to 2023. Firstly, we found no clear pattern in earthquake magnitudes over this period, except for a spike in 2010 following the devastating earthquake in Haiti. Because this was not significant, we chose not to visualize it. The number of earthquakes recorded has increased over the 23 years, likely due to improved earthquake detection technology and climate change. Regarding the impact of mining and blasting, we noticed a high number of seismic events in Wyoming, the top coal producer in the US since 1986. While major explosions are rare nowadays, ongoing safety measures in mines remain crucial. In response to the question about earthquakes affecting nuclear power plants, we highlighted the risk in countries like Japan, which has frequent earthquakes and numerous nuclear plants. The Fukushima-Daiichi accident in Japan in 2011 demonstrated the potential dangers. We recommend against placing nuclear plants in earthquake-prone areas in order to prevent significant harm. Lastly, we identified Japan, Indonesia, the Philippines, and Chile as regions with the most severe earthquakes. Our suggestions for these nations include increased funding for earthquake research and development, international collaboration to build global resilience, and the adoption of earthquake insurance and risk management.
USGS. (2019). Search Earthquake Catalog. Usgs.gov. earthquake.usgs.gov/earthquakes/search/
Earthquakes in a Warming World. Atmos. (2023, September 7). https://atmos.earth/earthquakes-in-a-warming-world/#:~:text=Today%2C%20earthquakes%20are%20becoming%20more
Halton, M. (2018, July 4). Revolution in Quake Detection Technology. BBC News. www.bbc.com/news/science-environment-44683284
Wyoming State Geological Survey. (n.d.). Www.wsgs.wyo.gov. https://www.wsgs.wyo.gov/energy/coal.aspx
Rea, T. (2014, November 8). Thunder under the House: One Family and the Hanna Mine Disasters | WyoHistory.org. Www.wyohistory.org. https://www.wyohistory.org/encyclopedia/thunder-under-house-one-family-and-hanna-mine-disasters#:~:text=Between%201912%20and%201938%2C%20160
BBC. (2019, February 25). North Korea’s missile and nuclear programme. BBC News. https://www.bbc.com/news/world-asia-41174689
World Nuclear Association. (2019). Nuclear Power in Japan | Japanese Nuclear Energy - World Nuclear Association. World-Nuclear.org. https://world-nuclear.org/information-library/country-profiles/countries-g-n/japan-nuclear-power.aspx
Nuclear Power Plants and Earthquakes - World Nuclear Association. (2021, March). World-Nuclear.org. https://world-nuclear.org/information-library/safety-and-security/safety-of-plants/nuclear-power-plants-and-earthquakes.aspx
culturetrip. (2018, January 10). Number of Ways Japan Prepares for Earthquakes. Culture Trip. https://theculturetrip.com/asia/japan/articles/8-ways-japan-prepares-for-earthquakes
Indonesia Tsunami Early Warning System (InaTEWS) | Department of Economic and Social Affairs. (n.d.). Sdgs.un.org. https://sdgs.un.org/partnerships/indonesia-tsunami-early-warning-system-inatews
PHIVOLCS Staff. (2016). Earthquake Monitoring. Dost.gov.ph. https://www.phivolcs.dost.gov.ph/index.php/earthquake/earthquake-monitoring
Earthquake and Tsunami in Chile: massive evacuation and building codes to reduce loss of life. (2023). Unesco.org. https://www.unesco.org/en/articles/earthquake-and-tsunami-chile-massive-evacuation-and-building-codes-reduce-loss-life#:~:text=Chile%20is%20actively%20involved%20in